Exploration server on Pittsburgh

Warning: this site is under development.
Warning: this site was generated automatically from raw corpora.
The information has therefore not been validated.

Completing the Results of the 2013 Boston Marathon

Internal identifier: 003748 (Main/Exploration); previous: 003747; next: 003749

Authors: Dorit Hammerling [United States]; Matthew Cefalu [United States]; Jessi Cisewski [United States]; Francesca Dominici [United States]; Giovanni Parmigiani [United States]; Charles Paulson [United States]; Richard L. Smith [United States]

Source: PLoS ONE (eISSN 1932-6203), 2014

RBID : PMC:3984103

English descriptors: Humans; Physical Endurance (physiology); Running (physiology); Sports

Abstract

The 2013 Boston marathon was disrupted by two bombs placed near the finish line. The bombs resulted in three deaths and several hundred injuries. Of lesser concern, in the immediate aftermath, was the fact that nearly 6,000 runners failed to finish the race. We were approached by the marathon's organizers, the Boston Athletic Association (BAA), and asked to recommend a procedure for projecting finish times for the runners who could not complete the race. With assistance from the BAA, we created a dataset consisting of all the runners in the 2013 race who reached the halfway point but failed to finish, as well as all runners from the 2010 and 2011 Boston marathons. The data consist of split times from each of the 5 km sections of the course, as well as the final 2.2 km (from 40 km to the finish). The statistical objective is to predict the missing split times for the runners who failed to finish in 2013. We set this problem in the context of the matrix completion problem, examples of which include imputing missing data in DNA microarray experiments, and the Netflix prize problem. We propose five prediction methods and create a validation dataset to measure their performance by mean squared error and other measures. The best method used local regression based on a K-nearest-neighbors algorithm (KNN method), though several other methods produced results of similar quality. We show how the results were used to create projected times for the 2013 runners and discuss potential for future application of the same methodology. We present the whole project as an example of reproducible research, in that we are able to make the full data and all the algorithms we have used publicly available, which may facilitate future research extending the methods or proposing completely different approaches.
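
The nearest-neighbor idea in the abstract can be illustrated with a minimal sketch. It is not the authors' KNN method, which the abstract describes as local regression: here the missing splits are simply the average of the corresponding splits of the K closest complete runners. The function name `knn_complete`, the use of `None` for a missing split, and the toy split times are all assumptions made for illustration.

```python
import math

def knn_complete(complete, partial, k=3):
    """Fill the missing splits of one runner from the k most similar finishers.

    complete: list of rows, one per finisher, each a full list of split times.
    partial:  one runner's split times, with None where no time was recorded.
    Simplification (assumption): missing values are a plain neighbor average,
    not the local regression mentioned in the abstract.
    """
    observed = [i for i, v in enumerate(partial) if v is not None]
    missing = [i for i, v in enumerate(partial) if v is None]

    def distance(row):
        # Euclidean distance over the splits the partial runner did record.
        return math.sqrt(sum((row[i] - partial[i]) ** 2 for i in observed))

    nearest = sorted(complete, key=distance)[:k]
    filled = list(partial)
    for i in missing:
        filled[i] = sum(row[i] for row in nearest) / len(nearest)
    return filled

# Toy data (hypothetical): three split times, in minutes, for four finishers.
finishers = [
    [20.0, 21.0, 22.0],
    [25.0, 26.0, 27.0],
    [30.0, 31.0, 33.0],
    [20.0, 20.0, 21.0],
]
print(knn_complete(finishers, [20.0, 20.5, None], k=2))
```

With these toy values the two closest finishers are the first and the last rows, so the missing third split is filled with the mean of 22.0 and 21.0.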


Url: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3984103
DOI: 10.1371/journal.pone.0093800
PubMed: 24727904
PubMed Central: 3984103



The document in XML format

<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">Completing the Results of the 2013 Boston Marathon</title>
<author>
<name sortKey="Hammerling, Dorit" sort="Hammerling, Dorit" uniqKey="Hammerling D" first="Dorit" last="Hammerling">Dorit Hammerling</name>
<affiliation wicri:level="2">
<nlm:aff id="aff1">
<addr-line>Statistical and Applied Mathematical Sciences Institute, Research Triangle Park, North Carolina, United States of America</addr-line>
</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Statistical and Applied Mathematical Sciences Institute, Research Triangle Park, North Carolina</wicri:regionArea>
<placeName>
<region type="state">Caroline du Nord</region>
</placeName>
</affiliation>
</author>
<author>
<name sortKey="Cefalu, Matthew" sort="Cefalu, Matthew" uniqKey="Cefalu M" first="Matthew" last="Cefalu">Matthew Cefalu</name>
<affiliation wicri:level="2">
<nlm:aff id="aff2">
<addr-line>Department of Biostatistics, Harvard School of Public Health, Boston, Massachusetts, United States of America</addr-line>
</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Department of Biostatistics, Harvard School of Public Health, Boston, Massachusetts</wicri:regionArea>
<placeName>
<region type="state">Massachusetts</region>
</placeName>
</affiliation>
</author>
<author>
<name sortKey="Cisewski, Jessi" sort="Cisewski, Jessi" uniqKey="Cisewski J" first="Jessi" last="Cisewski">Jessi Cisewski</name>
<affiliation wicri:level="4">
<nlm:aff id="aff3">
<addr-line>Department of Statistics, Carnegie Mellon University, Pittsburgh, Pennsylvania, United States of America</addr-line>
</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Department of Statistics, Carnegie Mellon University, Pittsburgh, Pennsylvania</wicri:regionArea>
<placeName>
<region type="state">Pennsylvanie</region>
<settlement type="city">Pittsburgh</settlement>
</placeName>
<orgName type="university">Université Carnegie-Mellon</orgName>
</affiliation>
</author>
<author>
<name sortKey="Dominici, Francesca" sort="Dominici, Francesca" uniqKey="Dominici F" first="Francesca" last="Dominici">Francesca Dominici</name>
<affiliation wicri:level="2">
<nlm:aff id="aff2">
<addr-line>Department of Biostatistics, Harvard School of Public Health, Boston, Massachusetts, United States of America</addr-line>
</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Department of Biostatistics, Harvard School of Public Health, Boston, Massachusetts</wicri:regionArea>
<placeName>
<region type="state">Massachusetts</region>
</placeName>
</affiliation>
</author>
<author>
<name sortKey="Parmigiani, Giovanni" sort="Parmigiani, Giovanni" uniqKey="Parmigiani G" first="Giovanni" last="Parmigiani">Giovanni Parmigiani</name>
<affiliation wicri:level="2">
<nlm:aff id="aff2">
<addr-line>Department of Biostatistics, Harvard School of Public Health, Boston, Massachusetts, United States of America</addr-line>
</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Department of Biostatistics, Harvard School of Public Health, Boston, Massachusetts</wicri:regionArea>
<placeName>
<region type="state">Massachusetts</region>
</placeName>
</affiliation>
<affiliation wicri:level="2">
<nlm:aff id="aff4">
<addr-line>Dana Farber Cancer Institute, Boston, Massachusetts, United States of America</addr-line>
</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Dana Farber Cancer Institute, Boston, Massachusetts</wicri:regionArea>
<placeName>
<region type="state">Massachusetts</region>
</placeName>
</affiliation>
</author>
<author>
<name sortKey="Paulson, Charles" sort="Paulson, Charles" uniqKey="Paulson C" first="Charles" last="Paulson">Charles Paulson</name>
<affiliation wicri:level="2">
<nlm:aff id="aff5">
<addr-line>Puffinware LLC, State College, Pennsylvania, United States of America</addr-line>
</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Puffinware LLC, State College, Pennsylvania</wicri:regionArea>
<placeName>
<region type="state">Pennsylvanie</region>
</placeName>
</affiliation>
</author>
<author>
<name sortKey="Smith, Richard L" sort="Smith, Richard L" uniqKey="Smith R" first="Richard L." last="Smith">Richard L. Smith</name>
<affiliation wicri:level="2">
<nlm:aff id="aff1">
<addr-line>Statistical and Applied Mathematical Sciences Institute, Research Triangle Park, North Carolina, United States of America</addr-line>
</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Statistical and Applied Mathematical Sciences Institute, Research Triangle Park, North Carolina</wicri:regionArea>
<placeName>
<region type="state">Caroline du Nord</region>
</placeName>
</affiliation>
<affiliation wicri:level="4">
<nlm:aff id="aff6">
<addr-line>Department of Statistics and Operations Research, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America</addr-line>
</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Department of Statistics and Operations Research, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina</wicri:regionArea>
<placeName>
<region type="state">Caroline du Nord</region>
<settlement type="city">Chapel Hill (Caroline du Nord)</settlement>
</placeName>
<orgName type="university">Université de Caroline du Nord à Chapel Hill</orgName>
</affiliation>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">PMC</idno>
<idno type="pmid">24727904</idno>
<idno type="pmc">3984103</idno>
<idno type="url">http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3984103</idno>
<idno type="RBID">PMC:3984103</idno>
<idno type="doi">10.1371/journal.pone.0093800</idno>
<date when="2014">2014</date>
<idno type="wicri:Area/Pmc/Corpus">001D18</idno>
<idno type="wicri:explorRef" wicri:stream="Pmc" wicri:step="Corpus" wicri:corpus="PMC">001D18</idno>
<idno type="wicri:Area/Pmc/Curation">001C93</idno>
<idno type="wicri:explorRef" wicri:stream="Pmc" wicri:step="Curation">001C93</idno>
<idno type="wicri:Area/Pmc/Checkpoint">001820</idno>
<idno type="wicri:explorRef" wicri:stream="Pmc" wicri:step="Checkpoint">001820</idno>
<idno type="wicri:source">PubMed</idno>
<idno type="wicri:Area/PubMed/Corpus">003368</idno>
<idno type="wicri:explorRef" wicri:stream="PubMed" wicri:step="Corpus" wicri:corpus="PubMed">003368</idno>
<idno type="wicri:Area/PubMed/Curation">003345</idno>
<idno type="wicri:explorRef" wicri:stream="PubMed" wicri:step="Curation">003345</idno>
<idno type="wicri:Area/PubMed/Checkpoint">003345</idno>
<idno type="wicri:explorRef" wicri:stream="PubMed" wicri:step="Checkpoint">003345</idno>
<idno type="wicri:Area/Ncbi/Merge">001A96</idno>
<idno type="wicri:Area/Ncbi/Curation">001A96</idno>
<idno type="wicri:Area/Ncbi/Checkpoint">001A96</idno>
<idno type="wicri:Area/Main/Merge">003922</idno>
<idno type="wicri:Area/Main/Curation">003748</idno>
<idno type="wicri:Area/Main/Exploration">003748</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title xml:lang="en" level="a" type="main">Completing the Results of the 2013 Boston Marathon</title>
<author>
<name sortKey="Hammerling, Dorit" sort="Hammerling, Dorit" uniqKey="Hammerling D" first="Dorit" last="Hammerling">Dorit Hammerling</name>
<affiliation wicri:level="2">
<nlm:aff id="aff1">
<addr-line>Statistical and Applied Mathematical Sciences Institute, Research Triangle Park, North Carolina, United States of America</addr-line>
</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Statistical and Applied Mathematical Sciences Institute, Research Triangle Park, North Carolina</wicri:regionArea>
<placeName>
<region type="state">Caroline du Nord</region>
</placeName>
</affiliation>
</author>
<author>
<name sortKey="Cefalu, Matthew" sort="Cefalu, Matthew" uniqKey="Cefalu M" first="Matthew" last="Cefalu">Matthew Cefalu</name>
<affiliation wicri:level="2">
<nlm:aff id="aff2">
<addr-line>Department of Biostatistics, Harvard School of Public Health, Boston, Massachusetts, United States of America</addr-line>
</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Department of Biostatistics, Harvard School of Public Health, Boston, Massachusetts</wicri:regionArea>
<placeName>
<region type="state">Massachusetts</region>
</placeName>
</affiliation>
</author>
<author>
<name sortKey="Cisewski, Jessi" sort="Cisewski, Jessi" uniqKey="Cisewski J" first="Jessi" last="Cisewski">Jessi Cisewski</name>
<affiliation wicri:level="4">
<nlm:aff id="aff3">
<addr-line>Department of Statistics, Carnegie Mellon University, Pittsburgh, Pennsylvania, United States of America</addr-line>
</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Department of Statistics, Carnegie Mellon University, Pittsburgh, Pennsylvania</wicri:regionArea>
<placeName>
<region type="state">Pennsylvanie</region>
<settlement type="city">Pittsburgh</settlement>
</placeName>
<orgName type="university">Université Carnegie-Mellon</orgName>
</affiliation>
</author>
<author>
<name sortKey="Dominici, Francesca" sort="Dominici, Francesca" uniqKey="Dominici F" first="Francesca" last="Dominici">Francesca Dominici</name>
<affiliation wicri:level="2">
<nlm:aff id="aff2">
<addr-line>Department of Biostatistics, Harvard School of Public Health, Boston, Massachusetts, United States of America</addr-line>
</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Department of Biostatistics, Harvard School of Public Health, Boston, Massachusetts</wicri:regionArea>
<placeName>
<region type="state">Massachusetts</region>
</placeName>
</affiliation>
</author>
<author>
<name sortKey="Parmigiani, Giovanni" sort="Parmigiani, Giovanni" uniqKey="Parmigiani G" first="Giovanni" last="Parmigiani">Giovanni Parmigiani</name>
<affiliation wicri:level="2">
<nlm:aff id="aff2">
<addr-line>Department of Biostatistics, Harvard School of Public Health, Boston, Massachusetts, United States of America</addr-line>
</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Department of Biostatistics, Harvard School of Public Health, Boston, Massachusetts</wicri:regionArea>
<placeName>
<region type="state">Massachusetts</region>
</placeName>
</affiliation>
<affiliation wicri:level="2">
<nlm:aff id="aff4">
<addr-line>Dana Farber Cancer Institute, Boston, Massachusetts, United States of America</addr-line>
</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Dana Farber Cancer Institute, Boston, Massachusetts</wicri:regionArea>
<placeName>
<region type="state">Massachusetts</region>
</placeName>
</affiliation>
</author>
<author>
<name sortKey="Paulson, Charles" sort="Paulson, Charles" uniqKey="Paulson C" first="Charles" last="Paulson">Charles Paulson</name>
<affiliation wicri:level="2">
<nlm:aff id="aff5">
<addr-line>Puffinware LLC, State College, Pennsylvania, United States of America</addr-line>
</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Puffinware LLC, State College, Pennsylvania</wicri:regionArea>
<placeName>
<region type="state">Pennsylvanie</region>
</placeName>
</affiliation>
</author>
<author>
<name sortKey="Smith, Richard L" sort="Smith, Richard L" uniqKey="Smith R" first="Richard L." last="Smith">Richard L. Smith</name>
<affiliation wicri:level="2">
<nlm:aff id="aff1">
<addr-line>Statistical and Applied Mathematical Sciences Institute, Research Triangle Park, North Carolina, United States of America</addr-line>
</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Statistical and Applied Mathematical Sciences Institute, Research Triangle Park, North Carolina</wicri:regionArea>
<placeName>
<region type="state">Caroline du Nord</region>
</placeName>
</affiliation>
<affiliation wicri:level="4">
<nlm:aff id="aff6">
<addr-line>Department of Statistics and Operations Research, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America</addr-line>
</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Department of Statistics and Operations Research, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina</wicri:regionArea>
<placeName>
<region type="state">Caroline du Nord</region>
<settlement type="city">Chapel Hill (Caroline du Nord)</settlement>
</placeName>
<orgName type="university">Université de Caroline du Nord à Chapel Hill</orgName>
</affiliation>
</author>
</analytic>
<series>
<title level="j">PLoS ONE</title>
<idno type="eISSN">1932-6203</idno>
<imprint>
<date when="2014">2014</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc>
<textClass>
<keywords scheme="KwdEn" xml:lang="en">
<term>Humans</term>
<term>Physical Endurance (physiology)</term>
<term>Running (physiology)</term>
<term>Sports</term>
</keywords>
<keywords scheme="KwdFr" xml:lang="fr">
<term>Course à pied (physiologie)</term>
<term>Endurance physique (physiologie)</term>
<term>Humains</term>
<term>Sports</term>
</keywords>
<keywords scheme="MESH" qualifier="physiologie" xml:lang="fr">
<term>Course à pied</term>
<term>Endurance physique</term>
</keywords>
<keywords scheme="MESH" qualifier="physiology" xml:lang="en">
<term>Physical Endurance</term>
<term>Running</term>
</keywords>
<keywords scheme="MESH" xml:lang="en">
<term>Humans</term>
<term>Sports</term>
</keywords>
<keywords scheme="MESH" xml:lang="fr">
<term>Humains</term>
<term>Sports</term>
</keywords>
</textClass>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">
<p>The 2013 Boston marathon was disrupted by two bombs placed near the finish line. The bombs resulted in three deaths and several hundred injuries. Of lesser concern, in the immediate aftermath, was the fact that nearly 6,000 runners failed to finish the race. We were approached by the marathon's organizers, the Boston Athletic Association (BAA), and asked to recommend a procedure for projecting finish times for the runners who could not complete the race. With assistance from the BAA, we created a dataset consisting of all the runners in the 2013 race who reached the halfway point but failed to finish, as well as all runners from the 2010 and 2011 Boston marathons. The data consist of split times from each of the 5 km sections of the course, as well as the final 2.2 km (from 40 km to the finish). The statistical objective is to predict the missing split times for the runners who failed to finish in 2013. We set this problem in the context of the matrix completion problem, examples of which include imputing missing data in DNA microarray experiments, and the Netflix prize problem. We propose five prediction methods and create a validation dataset to measure their performance by mean squared error and other measures. The best method used local regression based on a K-nearest-neighbors algorithm (KNN method), though several other methods produced results of similar quality. We show how the results were used to create projected times for the 2013 runners and discuss potential for future application of the same methodology. We present the whole project as an example of reproducible research, in that we are able to make the full data and all the algorithms we have used publicly available, which may facilitate future research extending the methods or proposing completely different approaches.</p>
</div>
</front>
<back>
<div1 type="bibliography">
<listBibl>
<biblStruct></biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Candes, Ej" uniqKey="Candes E">EJ Candès</name>
</author>
<author>
<name sortKey="Recht, B" uniqKey="Recht B">B Recht</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Candes, Ej" uniqKey="Candes E">EJ Candès</name>
</author>
<author>
<name sortKey="Tao, T" uniqKey="Tao T">T Tao</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Mazumder, R" uniqKey="Mazumder R">R Mazumder</name>
</author>
<author>
<name sortKey="Hastie, T" uniqKey="Hastie T">T Hastie</name>
</author>
<author>
<name sortKey="Tibshirani, R" uniqKey="Tibshirani R">R Tibshirani</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Troyanskaya, O" uniqKey="Troyanskaya O">O Troyanskaya</name>
</author>
<author>
<name sortKey="Cantor, M" uniqKey="Cantor M">M Cantor</name>
</author>
<author>
<name sortKey="Sherlock, G" uniqKey="Sherlock G">G Sherlock</name>
</author>
<author>
<name sortKey="Brown, P" uniqKey="Brown P">P Brown</name>
</author>
<author>
<name sortKey="Hastie, T" uniqKey="Hastie T">T Hastie</name>
</author>
</analytic>
</biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
</listBibl>
</div1>
</back>
</TEI>
<affiliations>
<list>
<country>
<li>États-Unis</li>
</country>
<region>
<li>Caroline du Nord</li>
<li>Massachusetts</li>
<li>Pennsylvanie</li>
</region>
<settlement>
<li>Chapel Hill (Caroline du Nord)</li>
<li>Pittsburgh</li>
</settlement>
<orgName>
<li>Université Carnegie-Mellon</li>
<li>Université de Caroline du Nord à Chapel Hill</li>
</orgName>
</list>
<tree>
<country name="États-Unis">
<region name="Caroline du Nord">
<name sortKey="Hammerling, Dorit" sort="Hammerling, Dorit" uniqKey="Hammerling D" first="Dorit" last="Hammerling">Dorit Hammerling</name>
</region>
<name sortKey="Cefalu, Matthew" sort="Cefalu, Matthew" uniqKey="Cefalu M" first="Matthew" last="Cefalu">Matthew Cefalu</name>
<name sortKey="Cisewski, Jessi" sort="Cisewski, Jessi" uniqKey="Cisewski J" first="Jessi" last="Cisewski">Jessi Cisewski</name>
<name sortKey="Dominici, Francesca" sort="Dominici, Francesca" uniqKey="Dominici F" first="Francesca" last="Dominici">Francesca Dominici</name>
<name sortKey="Parmigiani, Giovanni" sort="Parmigiani, Giovanni" uniqKey="Parmigiani G" first="Giovanni" last="Parmigiani">Giovanni Parmigiani</name>
<name sortKey="Parmigiani, Giovanni" sort="Parmigiani, Giovanni" uniqKey="Parmigiani G" first="Giovanni" last="Parmigiani">Giovanni Parmigiani</name>
<name sortKey="Paulson, Charles" sort="Paulson, Charles" uniqKey="Paulson C" first="Charles" last="Paulson">Charles Paulson</name>
<name sortKey="Smith, Richard L" sort="Smith, Richard L" uniqKey="Smith R" first="Richard L." last="Smith">Richard L. Smith</name>
<name sortKey="Smith, Richard L" sort="Smith, Richard L" uniqKey="Smith R" first="Richard L." last="Smith">Richard L. Smith</name>
</country>
</tree>
</affiliations>
</record>
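
The record above uses the `wicri:` and `nlm:` prefixes without declaring them, so a namespace-aware XML parser will typically reject it with an "unbound prefix" error. The sketch below shows one possible workaround in Python: wrap the record in an element that declares both prefixes, then read the author names from `titleStmt`. The function name `author_names`, the `wrapper` element, and the placeholder namespace URIs are assumptions, not part of Dilib or of the record itself.

```python
import xml.etree.ElementTree as ET

# Placeholder URIs (assumption): the record does not say what the real ones are.
NS_DECL = 'xmlns:wicri="urn:example:wicri" xmlns:nlm="urn:example:nlm"'

def author_names(record_xml):
    """Return the author names listed in the titleStmt of a <record> string.

    Assumes record_xml starts at <record> and carries no XML declaration.
    """
    # Declaring the prefixes on a wrapper element lets the parser accept them.
    wrapped = f"<wrapper {NS_DECL}>{record_xml}</wrapper>"
    root = ET.fromstring(wrapped)
    path = "./record/TEI/teiHeader/fileDesc/titleStmt/author/name"
    return [name.text for name in root.findall(path)]

# Abbreviated sample with the same shape as the record above.
sample = (
    '<record><TEI><teiHeader><fileDesc><titleStmt>'
    '<title xml:lang="en">Completing the Results of the 2013 Boston Marathon</title>'
    '<author><name sortKey="Hammerling, Dorit">Dorit Hammerling</name>'
    '<affiliation wicri:level="2"><nlm:aff id="aff1">'
    '<addr-line>Research Triangle Park</addr-line></nlm:aff></affiliation>'
    '</author>'
    '<author><name sortKey="Cefalu, Matthew">Matthew Cefalu</name></author>'
    '</titleStmt></fileDesc></teiHeader></TEI></record>'
)
print(author_names(sample))
```

Restricting the path to `titleStmt` avoids counting each author twice, since the same names are repeated under `sourceDesc` and in the `affiliations` tree.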

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Wicri/Amérique/explor/PittsburghV1/Data/Main/Exploration
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 003748 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd -nk 003748 | SxmlIndent | more

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Wicri/Amérique
   |area=    PittsburghV1
   |flux=    Main
   |étape=   Exploration
   |type=    RBID
   |clé=     PMC:3984103
   |texte=   Completing the Results of the 2013 Boston Marathon
}}

To generate wiki pages

HfdIndexSelect -h $EXPLOR_AREA/Data/Main/Exploration/RBID.i   -Sk "pubmed:24727904" \
       | HfdSelect -Kh $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd   \
       | NlmPubMed2Wicri -a PittsburghV1 

This area was generated with Dilib version V0.6.38.
Data generation: Fri Jun 18 17:37:45 2021. Site generation: Fri Jun 18 18:15:47 2021